Text document content characterization may be a component of document archival and retrieval systems. Multiple methods exist both to recognize text and to analyze the thematic content of scanned text documents. The results of this analysis are then used in a variety of ways by document archival, retrieval, and search mechanisms. A key to efficient archival and retrieval of documents is the manner in which the thematic content of those documents is displayed to the user of the system. This display must allow for the most precise identification of themes for optimal retrieval or efficient analysis. While a simple list of keywords or themes covered in one or more documents may be helpful, a conventional list lacks various capabilities and functionalities that would enable more efficient and precise analysis and retrieval.
There is a general need for document-summarizing methods and systems that can facilitate optimal retrieval and efficient analysis of documents. It is believed that the methods and systems of the illustrative embodiments help meet this need.